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Nayeem Islam, Murthy Devarakonda 

October 1996 Communications of the ACM, Volume 39 issue 10 

Full text available: pdf(242.40 KB) Additional Information: full citation, references, index terms



2 Fault-Tolerant Software for Real-Time Applications
H. Hecht 

December 1976 ACM Computing Surveys (CSUR), volume 8 issue 4 

Full text available: pdf(1.43 MB) Additional Information: full citation, references, citings, index terms



3 Industrial sessions: beyond relational tables; Coordinating backup/recovery and data
consistency between database and file systems

Suparna Bhattacharya, C. Mohan, Karen W. Brannon, Inderpal Narang, Hui-I Hsiao, 
Mahadevan Subramanian 

June 2002 Proceedings of the 2002 ACM SIGMOD international conference on 
Management of data 

Full text available: pdf(1.44 MB) Additional Information: full citation, abstract, references, index terms

Managing a combined store consisting of database data and file data in a robust and 
consistent manner is a challenge for database systems and content management systems. 
In such a hybrid system, images, videos, engineering drawings, etc. are stored as files on a 
file server while meta-data referencing/indexing such files is created and stored in a 
relational database to take advantage of efficient search. In this paper we describe 
solutions for two potentially problematic aspects of such a data ... 

Keywords: DB2, content management, database backup, database recovery, datalinks 



4 Computing the performability of layered distributed systems with a management
architecture 

Olivia Das, C. Murray Woodside 

January 2004 ACM SIGSOFT Software Engineering Notes, Proceedings of the fourth
international workshop on Software and performance, volume 29 issue 1
Full text available: pdf Additional Information: full citation, abstract, references

This paper analyzes the performability of client-server applications that use a separate fault 
management architecture for monitoring and controlling of the status of the application
software and hardware. The analysis considers the impact of the management components 
and connections, and their reliability, on performability. The approach combines minpath 
algorithms, Layered Queueing analysis and non-coherent fault tree analysis techniques for 
efficient computation of expected reward rate of the ... 

Keywords: distributed systems, layered queueing networks, non-coherent fault trees, 
performability, system fault-tolerance 



5 A methodology for fast PC hard disk state restoration 
David D. Langan, Thomas J. Scott 

March 1992 Proceedings of the 1992 ACM/SIGAPP symposium on Applied computing: 
technological challenges of the 1990's 

Full text available: pdf(676 KB) Additional Information: full citation, references, index terms
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Krishna Phani Gummadi, Madhavarapu Jnana Pradeep, C. Siva Ram Murthy 
February 2003 IEEE/ACM Transactions on Networking (TON), volume 11 issue 1

Full text available: pdf(606.18 KB) Additional Information: full citation, abstract, references, index terms

Several distributed real-time applications (e.g., medical imaging, air traffic control, and 
video conferencing) demand hard guarantees on the message delivery latency and the 
recovery delay from component failures. As these demands cannot be met in traditional 
datagram services, special schemes have been proposed to provide timely recovery for 
real-time communications in multihop networks. These schemes reserve additional network 
resources (spare resources) a priori along a backup channel ... 

Keywords: backup channel, backup multiplexing, dependable connection, multihop 
network, primary channel, quality-of-service (QoS), real-time communication, resource 
reservation protocol (RSVP), segmented backup 
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Yuanyuan Zhou, Peter M. Chen, Kai Li 

May 1999 Proceedings of the 13th international conference on Supercomputing 
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High speed on-line backup when using logical log operations 
David B. Lomet 

May 2000 ACM SIGMOD Record, Proceedings of the 2000 ACM SIGMOD international
conference on Management of data, volume 29 issue 2

Additional Information: full citation, abstract, references, citings, index
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m terms 

Media recovery protects a database from failures of the stable medium by maintaining an 
extra copy of the database, called the backup, and a media recovery log. When a failure 
occurs, the database is "restored" from the backup, and the media recovery log is used to 
roll forward the database to the desired time, usually the current time. Backup must be 
both fast and "on-line", i.e. concurrent with on-going update activity. Conventional online 
backup sequentially copies ... 
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R. Studer 

March 1984 Proceedings of the 7th international conference on Software engineering 
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We introduce formal, abstract models for specifying modern dialogue concepts offered by 
dialogue interfaces. The dialogue concepts considered in this paper are menus, forms, and 
windows. Using these abstract models a totally formal definition of man-machine 
interactions and screen layouts is achieved. Thus the semantics of user actions can be 
formalized. The specification method we are using is the Vienna Development Method. 
Examples are taken from the Application Development and Support Sy ... 

10 Management of a remote backup copy

Richard P. King, Nagui Halim, Hector Garcia-Molina, Christos A. Polyzois
May 1991 ACM Transactions on Database Systems (TODS), volume 16 issue 2 

Full text available: pdf(2.45 MB) Additional Information: full citation, abstract, references, citings, index terms

A remote backup database system tracks the state of a primary system, taking over 
transaction processing when disaster hits the primary site. The primary and backup sites 
are physically isolated so that failures at one site are unlikely to propagate to the other. For
correctness, the execution schedule at the backup must be equivalent to that at the 
primary. When the primary and backup sites contain a single processor, it is easy to 
achieve this property. However, this is harder to do when ... 

Keywords: database initialization, hot spare, hot standby, remote backup 



11 A NonStop kernel
Joel F. Bartlett 

December 1981 Proceedings of the eighth ACM symposium on Operating systems 
principles 

Full text available: pdf(757.37 KB) Additional Information: full citation, abstract, references, citings, index terms

The Tandem NonStop System is a fault-tolerant [1], expandable, and distributed computer 
system designed expressly for online transaction processing. This paper describes the key 
primitives of the kernel of the operating system. The first section describes the basic 
hardware building blocks and introduces their software analogs: processes and messages. 
Using these primitives, a mechanism that allows fault-tolerant resource access, the 
process-pair, is described. The paper concludes with some ... 

12 Process backup in producer-consumer systems 
David L. Russell 

November 1977 Proceedings of the sixth ACM symposium on Operating systems 
principles 

Full text available: pdf(613.51 KB) Additional Information: full citation, abstract, references, citings, index terms
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System state restoration after detection of an error is discussed for producer-consumer 
systems, with emphasis on the control of the domino effect. Recovery primitives MARK, 
RESTORE, and PURGE are proposed that, in conjunction with the use of SEND-RECEIVE 
interprocess communication primitives, allow bounds to be placed on the amount of 
unnecessary restoration that can occur as a result of system state restoration. 

Keywords: Asynchronous programming, Domino effect, Error recovery, Interprocess 
communication, Message facilities, Software fault tolerance, State restoration 
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Network applications and services need to be environment-aware in order to meet quality- 
of-service requirements in an increasingly dynamic world. In this paper we consider 
partition awareness as an instance of environment awareness in network applications that 
need to be reliable and self-managing. Partition-aware applications dynamically reconfigure 
themselves and adjust the quality of their services in response to network partitions and 
merges. As such, they can automatically ada ... 

14 An execution model for distributed object-oriented computation 
Edward H. Bensley, Thomas J. Brando, Myra Jean Prelle 

January 1988 ACM SIGPLAN Notices, Conference proceedings on Object-oriented
programming systems, languages and applications, volume 23 issue 11
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Full text available: T%a pdf(7S6.18 KB) ; 


This paper describes an execution model being developed for distributed object-oriented computation in
a message-passing multiple-instruction/multiple-data-stream (MIMD) environment. The 
objective is to execute an object-oriented program as concurrently as possible. Some 
opportunities for concurrency can be identified explicitly by the programmer. Others can be 
identified at compile time. There are some opportunities for concurrency, however, that can 
only be discovered at runtime because they are data ... 

15 A reusable lightweight executive for command and control systems 
Nathan Fleener, Laura Moody, Mary Stewart 

November 1998 ACM SIGAda Ada Letters, Proceedings of the 1998 annual ACM SIGAda
international conference on Ada, volume XVIII issue 6
Full text available: pdf(675.14 KB) Additional Information: full citation, references, index terms
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17 Towards a fault-tolerant multi-agent system architecture 
Sanjeev Kumar, Philip R. Cohen 

June 2000 Proceedings of the fourth international conference on Autonomous agents 
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Keywords: autonomy, fault-tolerance, multi-agent systems, teamwork 
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Martin Mikelsons 

January 1975 Proceedings of the 2nd ACM SIGACT-SIGPLAN symposium on Principles 
of programming languages 

Full text available: pdf(875.01 KB) Additional Information: full citation, abstract, references, citings

This paper describes a system being developed to bridge the gap between an application 
program and a user inexperienced in the ways of computers. The user explores the 
characteristics of the available programs by a natural language dialogue with the system. 
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The dialogue is supported by a knowledge base covering both the program semantics and 
the application domain. This paper addresses the problems of representation and inference 
involved in this approach and describes our solution for them. 
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